Rumba : a Python framework for automating large-scale recursive internet experiments on GENI and FIRE+
Designing and running Convolutional Neural Networks (CNNs) is not easy for two reasons: 1) given an architecture, finding the optimal number of filters (i.e., the width) at each layer is tricky; and 2) the computational intensity of CNNs impedes deployment on computationally limited devices. Oracle Pruning is designed to remove the unimportant filters from a well-trained CNN. It estimates the filters’ importance by ablating them in turn and evaluating the model, and thus delivers high accuracy, but it suffers from intolerable time complexity and requires a given resulting width that it cannot find automatically. To address these problems, we propose Approximated Oracle Filter Pruning (AOFP), which keeps searching for the least important filters in a binary search manner, makes pruning attempts by masking out filters randomly, accumulates the resulting errors, and finetunes the model via a multi-path framework. As AOFP enables simultaneous pruning on multiple layers, we can prune an existing very deep CNN with acceptable time cost, negligible accuracy drop, and no heuristic knowledge, or re-design a model which exerts higher accuracy and faster inferenc
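The ablate-and-evaluate idea behind Oracle Pruning can be illustrated with a minimal sketch. This is not the AOFP method itself; the toy linear "layer", the data, and all function names below are hypothetical, and each weight stands in for one filter.

```python
# Toy oracle-style importance scoring (hypothetical illustration, not AOFP).
# Each weight plays the role of one filter; a filter is "ablated" by masking it to 0.

def predict(weights, x, mask):
    """Output of a toy linear layer with some filters masked out."""
    return sum(w * xi * m for w, xi, m in zip(weights, x, mask))

def mse(weights, data, mask):
    """Mean squared error of the masked model over (input, target) pairs."""
    return sum((predict(weights, x, mask) - y) ** 2 for x, y in data) / len(data)

def oracle_importance(weights, data):
    """Score each filter by the loss increase observed when it alone is ablated."""
    full = [1] * len(weights)
    base = mse(weights, data, full)
    scores = []
    for i in range(len(weights)):
        mask = full.copy()
        mask[i] = 0  # ablate filter i only
        scores.append(mse(weights, data, mask) - base)
    return scores

# Assumed toy data: targets equal the unpruned model's outputs.
weights = [2.0, 0.01, -1.5]
data = [([1.0, 1.0, 1.0], 0.51), ([2.0, 0.0, 1.0], 2.5), ([0.0, 3.0, 2.0], -2.97)]
scores = oracle_importance(weights, data)
least_important = min(range(len(scores)), key=scores.__getitem__)
```

Each filter costs one full evaluation pass, which is the time-complexity problem the abstract attributes to Oracle Pruning.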
RawHDR: High Dynamic Range Image Reconstruction from a Single Raw Image
High dynamic range (HDR) images capture many more intensity levels than
standard ones. Current methods predominantly generate HDR images from 8-bit low
dynamic range (LDR) sRGB images that have been degraded by the camera
processing pipeline. However, it becomes a formidable task to retrieve
extremely high dynamic range scenes from such limited bit-depth data. Unlike
existing methods, the core idea of this work is to incorporate more informative
Raw sensor data to generate HDR images, aiming to recover scene information in
hard regions (the darkest and brightest areas of an HDR scene). To this end, we
propose a model tailor-made for Raw images, harnessing the unique features of
Raw data to facilitate the Raw-to-HDR mapping. Specifically, we learn exposure
masks to separate the hard and easy regions of a high dynamic scene. Then, we
introduce two important forms of guidance: dual intensity guidance, which guides less
informative channels with more informative ones, and global spatial guidance,
which extrapolates scene specifics over an extended spatial domain. To verify
our Raw-to-HDR approach, we collect a large Raw/HDR paired dataset for both
training and testing. Our empirical evaluations validate the superiority of the
proposed Raw-to-HDR reconstruction model, as well as our newly captured dataset
in the experiments.
Comment: ICCV 202
A Survey on Passing-through Control of Multi-Robot Systems in Cluttered Environments
This survey presents a comprehensive review of various methods and algorithms
related to passing-through control of multi-robot systems in cluttered
environments. Numerous studies have investigated this area, and we identify
several avenues for enhancing existing methods. This survey describes some
models of robots and commonly considered control objectives, followed by an
in-depth analysis of four types of algorithms that can be employed for
passing-through control: leader-follower formation control, multi-robot
trajectory planning, control-based methods, and virtual tube planning and
control. Furthermore, we conduct a comparative analysis of these techniques and
provide some subjective and general evaluations.
Comment: 18 pages, 19 figure
Distributed Control for a Multi-Agent System to Pass through a Connected Quadrangle Virtual Tube
In order to guide the multi-agent system in a cluttered environment, a
connected quadrangle virtual tube is designed for all agents to keep moving
within it, whose basis is called the single trapezoid virtual tube. There is no
obstacle inside the tube, so the area inside the tube can be seen as a
safety zone. Then, a distributed swarm controller is proposed for the single
trapezoid virtual tube passing problem. This issue is resolved by a gradient
vector field method with no local minima. Formal analyses and proofs are made
to show that all agents are able to pass the single trapezoid virtual tube.
Finally, a modified controller is put forward for convenience in practical use.
For the connected quadrangle virtual tube, a modified switching logic is
proposed to avoid the deadlock and prevent agents from moving outside the
virtual tube. Finally, the effectiveness of the proposed method is validated by
numerical simulations and real experiments.
Comment: 12 pages, 14 figures. arXiv admin note: substantial text overlap with
arXiv:2112.0100
Distributed Control within a Trapezoid Virtual Tube Containing Obstacles for UAV Swarm Subject to Speed Constraints
For guiding the UAV swarm to pass through narrow openings, a trapezoid
virtual tube was designed in our previous work. In this paper, we extend it
to the case where obstacles exist inside the
trapezoid virtual tube and the UAVs have strict speed constraints. First, a
distributed vector field controller is proposed for the trapezoid virtual tube
with no obstacle inside. The relationship between the trapezoid virtual tube
and the speed constraints is also presented. Then, a switching logic for the
obstacle avoidance is put forward. The key point is to divide the trapezoid
virtual tube containing obstacles into several sub trapezoid virtual tubes with
no obstacle inside. Formal analyses and proofs are made to show that all UAVs
are able to pass through the trapezoid virtual tube safely. Besides, the
effectiveness of the proposed method is validated by numerical simulations and
real experiments.
Comment: 11 pages, 12 figure
Hybrid Spectral Denoising Transformer with Guided Attention
In this paper, we present a Hybrid Spectral Denoising Transformer (HSDT) for
hyperspectral image denoising. The challenge in adapting transformers to HSI
lies in overcoming the limitations of CNN-based methods in
capturing global and local spatial-spectral correlations while maintaining
efficiency and flexibility. To address these issues, we introduce a hybrid
approach that combines the advantages of both models with a Spatial-Spectral
Separable Convolution (S3Conv), Guided Spectral Self-Attention (GSSA), and
Self-Modulated Feed-Forward Network (SM-FFN). Our S3Conv works as a lightweight
alternative to 3D convolution, which extracts more spatial-spectral correlated
features while keeping the flexibility to tackle HSIs with an arbitrary number
of bands. These features are then adaptively processed by GSSA, which performs
3D self-attention across the spectral bands, guided by a set of learnable
queries that encode the spectral signatures. This not only enriches our model
with powerful capabilities for identifying global spectral correlations but
also maintains linear complexity. Moreover, our SM-FFN introduces a
self-modulation that intensifies the activations of more informative regions,
which further strengthens the aggregated features. Extensive experiments
on various datasets under both simulated and real-world noise
show that our HSDT significantly outperforms the existing state-of-the-art
methods while maintaining low computational overhead. Code is at
https://github.com/Zeqiang-Lai/HSDT.
Comment: ICCV 202
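The "lightweight alternative to 3D convolution" claim can be made concrete with a rough parameter count. The abstract does not describe the exact S3Conv layout, so the sketch below assumes a generic factorization into one k x k spatial kernel and one length-k spectral kernel per channel pair; the function names and channel sizes are hypothetical, and biases are ignored.

```python
# Rough weight counts: full 3D convolution vs. an assumed spatial + spectral factorization.
# This is an illustrative assumption, not the confirmed S3Conv design.

def conv3d_params(c_in, c_out, k):
    """Weights in a full k x k x k 3D convolution (no bias)."""
    return c_in * c_out * k ** 3

def separable_params(c_in, c_out, k):
    """Weights when each channel pair uses a k x k spatial kernel plus a length-k spectral kernel."""
    return c_in * c_out * (k * k + k)

c_in, c_out, k = 16, 16, 3          # hypothetical layer sizes
full = conv3d_params(c_in, c_out, k)
sep = separable_params(c_in, c_out, k)
ratio = sep / full                  # fraction of the 3D-conv weights that remain
```

Under these assumptions the factorized layer keeps 12 of every 27 weights for k = 3, and the saving grows with larger kernels. Neither count depends on the number of spectral bands, which fits the abstract's remark about handling an arbitrary number of bands.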
Improving Person Re-identification by Attribute and Identity Learning
Person re-identification (re-ID) and attribute recognition share a common
goal: learning pedestrian descriptions. They differ in
granularity. Most existing re-ID methods only take identity labels of
pedestrians into consideration. However, we find the attributes, containing
detailed local descriptions, are beneficial in allowing the re-ID model to
learn more discriminative feature representations. In this paper, based on the
complementarity of attribute labels and ID labels, we propose an
attribute-person recognition (APR) network, a multi-task network which learns a
re-ID embedding and at the same time predicts pedestrian attributes. We
manually annotate attribute labels for two large-scale re-ID datasets, and
systematically investigate how person re-ID and attribute recognition benefit
from each other. In addition, we re-weight the attribute predictions
considering the dependencies and correlations among the attributes. The
experimental results on two large-scale re-ID benchmarks demonstrate that by
learning a more discriminative representation, APR achieves competitive re-ID
performance compared with the state-of-the-art methods. We use APR to speed up
the retrieval process by ten times with a minor accuracy drop of 2.92% on
Market-1501. We also apply APR to the attribute recognition task and
demonstrate improvement over the baselines.
Comment: Accepted to Pattern Recognition (PR
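A multi-task objective of the kind described for APR, with one identity head and several attribute heads, can be sketched as a weighted sum of cross-entropy terms. The weighting scheme, the parameter `lam`, and the function names are assumptions for illustration; the abstract does not give the actual loss.

```python
import math

# Hypothetical multi-task loss: identity classification plus averaged attribute losses.

def cross_entropy(probs, label):
    """Negative log-likelihood of the true label under predicted class probabilities."""
    return -math.log(probs[label])

def multitask_loss(id_probs, id_label, attr_probs, attr_labels, lam=0.5):
    """Combine the identity loss with the mean attribute loss; lam is an assumed balance weight."""
    id_loss = cross_entropy(id_probs, id_label)
    attr_loss = sum(
        cross_entropy(p, y) for p, y in zip(attr_probs, attr_labels)
    ) / len(attr_labels)
    return lam * id_loss + (1 - lam) * attr_loss

# Toy example: two identities and two binary attributes, all predicted at chance.
loss = multitask_loss(
    id_probs=[0.5, 0.5], id_label=0,
    attr_probs=[[0.5, 0.5], [0.5, 0.5]], attr_labels=[1, 0],
)
```

Averaging over attributes keeps the attribute term on the same scale as the identity term regardless of how many attributes are annotated.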